Sains Malaysiana 55(3)(2026): 491-501

http://doi.org/10.17576/jsm-2026-5503-11  

A Mini-Batch Algorithm with Adaptive Learning Rate Strategy
(Algoritma Kelompok Mini dengan Strategi Kadar Pembelajaran Adaptif)

WEIJUAN SHI¹, ADIBAH SHUIB²,* & ZURAIDA ALWADOOD²

 

¹College of Mathematics and Finance, Hunan University of Humanities, Science and Technology, Loudi, China

²Faculty of Computer and Mathematical Sciences (FSKM), Universiti Teknologi MARA (UiTM), 40450 Shah Alam, Selangor, Malaysia

Received: 31 December 2024 / Accepted: 19 February 2026

Abstract

To address the limitations of manually selected step sizes or diminishing step-size sequences, which can slow convergence in mini-batch algorithms, we propose a strategy for automatically computing step sizes via the Positive Defined Stabilized Barzilai-Borwein (PDSBB) method. The PDSBB step size is integrated into the mini-batch semi-stochastic gradient descent (mS2GD) algorithm, yielding a new algorithm called mS2GD-PDSBB. Based on the linear convergence result, the computational complexity is characterized in terms of the expected number of stochastic gradient evaluations required to reach a prescribed accuracy. Computational experiments on benchmark instances are conducted to evaluate the convergence behavior of the proposed algorithm. A suitable mini-batch size enables mS2GD-PDSBB to attain performance consistent with that of the base algorithms. The numerical experiments demonstrate that mS2GD-PDSBB achieves stable and fast convergence with the adaptive step-size strategy. In particular, the algorithm shows reduced sensitivity to the choice of initial step size and consistently matches or outperforms mS2GD and mS2GD-BB in terms of objective sub-optimality and test error across different datasets.

Keywords: Adaptive step size; convergence rate; mS2GD algorithm; PDSBB method

Abstrak

Untuk menangani batasan pemilihan saiz langkah secara manual atau penggunaan jujukan saiz langkah yang semakin berkurangan, yang boleh memperlahankan penumpuan dalam algoritma kelompok mini, kami mencadangkan strategi untuk mengira saiz langkah secara automatik dengan menggunakan kaedah Positive Defined Stabilized Barzilai-Borwein (PDSBB). Saiz langkah PDSBB disepadukan ke dalam algoritma penurunan kecerunan separa stokastik kelompok mini (mS2GD), mewujudkan algoritma baharu yang dipanggil mS2GD-PDSBB. Berdasarkan hasil penumpuan linear, kerumitan pengiraan dicirikan dari segi bilangan penilaian kecerunan stokastik yang dijangka yang diperlukan untuk mencapai tahap ketepatan yang ditetapkan. Uji kaji pengiraan menggunakan contoh penanda aras dijalankan untuk menilai tingkah laku penumpuan algoritma yang dicadangkan. Saiz kelompok mini yang sesuai membawa algoritma mS2GD-PDSBB untuk berjaya mencapai prestasi yang konsisten dengan algoritma asas. Uji kaji berangka menunjukkan bahawa algoritma mS2GD-PDSBB yang dicadangkan mencapai penumpuan yang stabil dan pantas menerusi strategi saiz langkah adaptif. Secara khususnya, algoritma ini menunjukkan sensitiviti yang berkurangan terhadap pilihan saiz langkah awal dan secara konsisten mengatasi atau menyamai mS2GD dan mS2GD-BB dari segi sub-keoptimuman objektif dan ralat ujian merentasi set data yang berbeza.

Kata kunci: Algoritma mS2GD; kaedah PDSBB; kadar penumpuan; saiz langkah adaptif
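To make the adaptive step-size idea in the abstract concrete, the sketch below pairs an mS2GD-style variance-reduced inner loop (Konečný et al. 2016) with a Barzilai-Borwein-type step size recomputed from successive snapshots, in the spirit of the classical two-point rule of Barzilai & Borwein (1988), eta_k = ||s_{k-1}||^2 / (s_{k-1}^T y_{k-1}), where s_{k-1} = x_k - x_{k-1} and y_{k-1} = g_k - g_{k-1} are snapshot and full-gradient differences. The absolute value and epsilon term in the denominator, the uniform random inner-loop length, and the synthetic logistic-regression instance are all illustrative assumptions standing in for the exact PDSBB formula and experimental setup, which are given in Shi, Shuib & Alwadood (2023) and in the full paper; this is a minimal sketch, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic l2-regularized logistic regression (illustrative assumption):
# min_x (1/n) * sum_i log(1 + exp(-b_i * a_i^T x)) + (lam/2) * ||x||^2
n, d, lam = 500, 20, 1e-2
A = rng.standard_normal((n, d))
b = np.sign(A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n))

def grad_batch(x, idx):
    """Mini-batch gradient of the regularized logistic loss over rows idx."""
    Ai, bi = A[idx], b[idx]
    s = -bi / (1.0 + np.exp(bi * (Ai @ x)))
    return Ai.T @ s / len(idx) + lam * x

def full_grad(x):
    return grad_batch(x, np.arange(n))

def ms2gd_bb(x0, epochs=30, m=100, batch=10, eta0=0.1, eps=1e-8):
    """mS2GD-style loop; the BB-type step size below is an illustrative
    stand-in stabilization for PDSBB, not the formula from the paper."""
    x, eta = x0.copy(), eta0
    x_prev = g_prev = None
    for _ in range(epochs):
        mu = full_grad(x)                          # full gradient at the snapshot
        if x_prev is not None:
            s_vec, y_vec = x - x_prev, mu - g_prev
            # BB1 ratio, stabilized: abs() and eps keep the step positive
            eta = (s_vec @ s_vec) / (abs(s_vec @ y_vec) + eps * (s_vec @ s_vec)) / m
        x_prev, g_prev = x.copy(), mu.copy()
        y = x.copy()
        for _ in range(rng.integers(1, m + 1)):    # random inner-loop length
            idx = rng.choice(n, size=batch, replace=False)
            v = grad_batch(y, idx) - grad_batch(x, idx) + mu  # variance-reduced direction
            y -= eta * v
        x = y                                      # next snapshot
    return x

x_hat = ms2gd_bb(np.zeros(d))
print("final full-gradient norm:", np.linalg.norm(full_grad(x_hat)))

Because the step size is refreshed only once per outer epoch, from quantities the variance-reduced loop already computes, an adaptive rule of this kind adds essentially no per-iteration cost to the inner loop relative to a fixed step size.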

 

REFERENCES

Barzilai, J. & Borwein, J.M. 1988. Two-point step size gradient methods. IMA Journal of Numerical Analysis 8(1): 141-148.

Berahas, A.S., Nocedal, J. & Takáč, M. 2016. A multi-batch L-BFGS method for machine learning. Advances in Neural Information Processing Systems. pp. 1063-1071.

Chen, G., Li, Y., Zhang, J. & Huang, K. 2022. A survey of the four pillars for small object detection. Foundations and Trends in Computer Graphics and Vision 14(1-2): 1-145.

Condat, L. 2023. Proximal splitting algorithms for convex optimization: A tour of recent advances. SIAM Review 65(3): 699-763.

Hosny, S., Shouman, M.A. & Ali, A.A. 2023. Survey on compressed sensing over the past two decades. Array 19: 100308.

Khan, S., Naseer, M., Hayat, M., Zamir, S.W., Khan, F.S. & Shah, M. 2022. Transformers in vision: A survey. ACM Computing Surveys 54(10): 200.

Konečný, J., Liu, J., Richtárik, P. & Takáč, M. 2016. Mini-batch semi-stochastic gradient descent in the proximal setting. IEEE Journal on Selected Topics in Signal Processing 10(2): 242-255.

Li, M., Zhang, T., Chen, Y. & Smola, A.J. 2014. Efficient mini-batch training for stochastic optimization. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 661-670.

Lin, T., Wang, Y., Liu, X. & Qiu, X. 2022. A survey of transformers. AI Open 3: 111-132.

Liu, L., Ouyang, W., Wang, X., Fieguth, P., Chen, J., Liu, X. & Pietikäinen, M. 2020. Deep learning for generic object detection: A survey. International Journal of Computer Vision 128: 261-318.

Ma, K., Zeng, J., Xiong, J., Xu, Q., Cao, X., Liu, W. & Yao, Y. 2018. Stochastic non-convex ordinal embedding with stabilized Barzilai-Borwein step size. Proceedings of the 32nd AAAI Conference on Artificial Intelligence 458: 3738-3745.

Mittal, P., Ghosh, A. & Singh, S.K. 2024. A comprehensive survey of deep learning-based lightweight object detection on edge devices. Artificial Intelligence Review 57: 242.

Nagahara, M. 2024. A survey on compressed sensing approach to systems and control. SN Computer Science 5(5): 442.

Nesterov, Y. 2013. Gradient methods for minimizing composite functions. Mathematical Programming 140(1): 125-161.

Nguyen, L.M., Liu, J., Scheinberg, K. & Takáč, M. 2017. SARAH: A novel method for machine learning problems using stochastic recursive gradient. Proceedings of the 34th International Conference on Machine Learning. pp. 4009-4022.

Parikh, N. & Boyd, S. 2014. Proximal algorithms. Foundations and Trends in Optimization 1(3): 127-239.

Reddi, S.J., Hefny, A., Sra, S., Póczós, B. & Smola, A. 2016. Stochastic variance reduction for nonconvex optimization. Proceedings of the 33rd International Conference on Machine Learning. pp. 314-323.

SAS. 2023. Machine Learning: What it is and why it matters.

Shao, Y., Wang, Q. & Han, D. 2022. Efficient methods for convex problems with Bregman Barzilai–Borwein step sizes. Pacific Journal of Optimization 18(2): 333-348.

Shi, W., Shuib, A. & Alwadood, Z. 2023. Stochastic variance reduced gradient method embedded with positive defined stabilized Barzilai–Borwein. IAENG International Journal of Applied Mathematics 53(4): 1682-1687.

Tseng, P. 2000. A modified forward-backward splitting method for maximal monotone mappings. SIAM Journal on Control and Optimization 38(2): 431-446.

Wen, L., Cheng, Y., Fang, Y. & Li, X. 2023. A comprehensive survey of oriented object detection in remote sensing images. Expert Systems with Applications 224: 119960.

Xiao, L. & Zhang, T. 2014. A proximal stochastic gradient method with progressive variance reduction. SIAM Journal on Optimization 24(4): 2057-2075.

Yang, Z. 2024. SARAH-M: A fast stochastic recursive gradient descent algorithm via momentum. Expert Systems with Applications 238: 122295.

Yang, Z., Chen, Z. & Wang, C. 2021. Accelerating mini-batch SARAH by step size rules. Information Sciences 558: 157-173.

Yang, Z., Wang, C., Zhang, Z. & Li, J. 2018. Random Barzilai–Borwein step size for mini-batch algorithms. Engineering Applications of Artificial Intelligence 72: 124-135.

Yang, Z., Wang, C., Zhang, Z. & Li, J. 2019. Mini-batch algorithms with online step size. Knowledge-Based Systems 165: 228-240.

 

*Corresponding author; email: adibah253@uitm.edu.my